Robust multilingual Named Entity Recognition with shallow semi-supervised features

Abstract

We present a multilingual Named Entity Recognition approach based on a robust and general set of features across languages and datasets. Our system combines shallow local information with semi-supervised clustering features induced on large amounts of unlabeled text. Understanding, via empirical experimentation, how to effectively combine various types of clustering features allows us to seamlessly export our system to other datasets and languages. The result is a simple but highly competitive system which obtains state-of-the-art results across five languages and twelve datasets. The results are reported on standard shared task evaluation data such as CoNLL for English, Spanish and Dutch. Furthermore, and despite the lack of linguistically motivated features, we also report best results for languages such as Basque and German. In addition, we demonstrate that our method obtains very competitive results even when the amount of supervised data is cut by half, alleviating the dependency on manually annotated data. Finally, the results show that our emphasis on clustering features is crucial to develop robust out-of-domain models. The system and models are freely available to facilitate their use and guarantee the reproducibility of results.
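As a rough illustration of what combining shallow local information with clustering features can look like, the following minimal Python sketch builds a per-token feature dictionary. It is not the authors' implementation. The abstract does not specify the feature templates or the clustering methods, so everything here is an assumption made for illustration: the function names `word_shape` and `token_features`, the choice of local features (lowercased form, shape, three-character affixes, neighbouring words), and the use of hierarchical cluster paths given as bit strings (in the style of Brown clusters) truncated at hypothetical prefix lengths 4 and 6.

```python
from typing import Dict, List, Sequence


def word_shape(token: str) -> str:
    """Collapse a token into a coarse shape, merging repeated classes.

    For example, 'Bilbao' becomes 'Xx' and 'CoNLL' becomes 'XxX'.
    """
    shape: List[str] = []
    for ch in token:
        if ch.isupper():
            cls = "X"
        elif ch.islower():
            cls = "x"
        elif ch.isdigit():
            cls = "d"
        else:
            cls = ch
        # Only record a class when it differs from the previous one.
        if not shape or shape[-1] != cls:
            shape.append(cls)
    return "".join(shape)


def token_features(
    tokens: Sequence[str],
    i: int,
    clusters: Dict[str, str],
    prefix_lengths: Sequence[int] = (4, 6),  # hypothetical choice
) -> Dict[str, str]:
    """Build shallow local features plus cluster features for tokens[i].

    `clusters` is an assumed lookup from lowercased word to a bit-string
    cluster path, induced beforehand on unlabeled text.
    """
    tok = tokens[i]
    feats = {
        # Shallow, language-independent local information.
        "lower": tok.lower(),
        "shape": word_shape(tok),
        "prefix3": tok[:3],
        "suffix3": tok[-3:],
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }
    # Cluster features: coarser and finer views of the same cluster path.
    path = clusters.get(tok.lower())
    if path is not None:
        for n in prefix_lengths:
            feats[f"cluster{n}"] = path[:n]
    return feats


if __name__ == "__main__":
    toy_clusters = {"bilbao": "110100", "madrid": "110101"}  # made-up paths
    sentence = ["She", "lives", "in", "Bilbao"]
    print(token_features(sentence, 3, toy_clusters))
```

In this toy setup, "Bilbao" and "Madrid" share the same 4-bit cluster prefix, so a model trained on one can generalize to the other even if the second never appears in the labeled data. That is the general intuition behind cluster features; how the actual system combines different cluster types is described in the paper itself, not here.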
